Genome-wide identification and molecular evolution of Dof gene family in Camellia oleifera

DNA binding with one finger(Dof) gene family is a class of transcription factors which play an important role on plant growth and development. Genome-wide identification results indicated that there were 45 Dof genes(ColDof) in C.oleifera genome. All 45 ColDof proteins were non-transmembrane and non-secretory proteins. Phosphorylation site analysis showed that biological function of ColDof proteins were mainly realized by phosphorylation at serine (Ser) site. The secondary structure of 44 ColDof proteins was dominated by random coil, and only one ColDof protein was dominated by α-helix. ColDof genes’ promoter region contained a variety of cis-acting elements, including light responsive regulators, gibberellin responsive regulators, abscisic acid responsive regulators, auxin responsive regulators and drought induction responsive regulators. The SSR sites analysis showed that the proportion of single nucleotide repeats and the frequency of A/T in ColDof genes were the largest. Non-coding RNA analysis showed that 45 ColDof genes contained 232 miRNAs. Transcription factor binding sites of ColDof genes showed that ColDof genes had 5793 ERF binding sites, 4381 Dof binding sites, 2206 MYB binding sites, 3702 BCR-BPC binding sites. ColDof9, ColDof39 and ColDof44 were expected to have the most TFBSs. The collinearity analysis showed that there were 40 colinear locis between ColDof proteins and AtDof proteins. Phylogenetic analysis showed that ColDof gene family was most closely related to that of Camellia sinensis var. sinensis cv.Biyun and Camellia lanceoleosa. Protein-protein interaction analysis showed that ColDof34, ColDof20, ColDof28, ColDof35, ColDof42 and ColDof26 had the most protein interactions. The transcriptome analysis of C. oleifera seeds showed that 21 ColDof genes were involved in the growth and development process of C. oleifera seeds, and were expressed in 221 C. oleifera varieties. The results of qRT-PCR experiments treated with different concentrations NaCl and PEG6000 solutions indicated that ColDof1, ColDof2, ColDof14 and ColDof36 not only had significant molecular mechanisms for salt stress tolerance, but also significant molecular functions for drought stress tolerance in C. oleifera. The results of this study provide a reference for further understanding of the function of ColDof genes in C.oleifera. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-024-10622-6.


Introduction
Camellia oleifera, a native plant of China, is distributed in Guangdong, Hong Kong, Guangxi, Hunan and Jiangxi, China [1,2].It is a perennial wild shrub of Camellia Linn.Because its seeds can be squeezed into oil (tea oil) for consumption, it is named "tea" [3].Tea cake can be used as fertilizer, pesticide and feed; tea shell is an important raw material of activated carbon and tannin.C.oleifera can also be used for greening and beautifying the environment, or a good tree species for creating water and soil conservation forest, water conservation forest and biological fire prevention forest belt [4].Studies have shown that some plants of the tea group of Camellia are rich in caffeine and other purine alkaloids and tea polyphenols, which have great economic value.China is the distribution center of Camellia plants, with rich resources [5].Camellia plants mainly contain soap ridge, tannin and flavonoids, which have the effects of reducing blood sugar [6], weight loss [7], blood lipid regulation [8,9], antioxidant [10], antibacterial [11,12], anti-mutation [13] and so on.Among them, Camellia flower extract has been found to have certain whitening effect [14].And the fermentation products of Camellia endophytic fungi are expected to be used as raw materials for medicines, cosmetics and health care products [15], which has high research and development value.
Transcription factors (TFs) are proteins that can specifically bind to specific sequences upstream of genes, thereby ensuring that target genes are expressed at specific strengths at specific times and locations [16].As key regulatory proteins, TFs rarely function alone, and they usually recruit multiple TFs to achieve combined regulation of different metabolic pathways [17].Dof (DNAbinding with one finger) protein belongs to a class of subproteins in the zinc finger protein family, which is a trans-acting factor of single zinc finger structure and a plant-specific transcription factor, playing an important role in the growth and development of plants.Because it has a unique single zinc finger conservative DNA binding domain rich in Cys residues, it is named Dof domain [18][19][20].Dof domain is generally 200 ∼ 400 amino acids long and mainly consists of two regions: the conservative N-terminal and the variable C-terminal.The N-terminal contains a highly conserved Dof domain composed of 52 amino acids, in which the CX2CX21CX2C motif forms a single zinc finger structure.In this single zinc finger structure, 1 Zn2 + is covalently bound to 4 Cys residues.Zn2 + and Cys residues are the guarantee of Dof protein activity.The presence of bivalent ion chelators and any replacement of Cys residues will inactivate Dof protein.The Dof functional domain presents a C2C2 type zinc finger structure.Dof protein can not only bind to specific DNA sequences to regulate gene expression, but also interact with certain proteins to participate in the regulation of plant growth and development and abiotic stress response [21,22].The core sequence recognized by Dof protein is 5'-AAAG-3' or 5'-CTTT-3' [23]; its C-terminal amino acid sequence has a large variation and is the transcription regulatory domain of Dof protein [24], which can specifically recognize the cis-acting element with a sequence of 5'-AAAG-3' .Dof transcription factor plays an important regulatory role in the process of plant growth and development.
Dof transcription factors are involved in plant seed germination, tissue differentiation and widely involved in physiological and biochemical processes such as carbon and nitrogen metabolism [16,17].For example, in maize, the expression of ZmDof36 can promote the biosynthesis of grain starch, while inhibiting the synthesis of soluble sugar and reducing sugar [25].Dof transcription factors can not only respond to hormones and growth regulators, but also participate in light response.Dof transcription factors participate in defending cell-specific gene expression and participate in the process of plant morphological changes under adverse conditions [26,27].Studies have shown that ZmDof1 can inhibit the expression of Zm401, thereby affecting pollen-specific expression [28].NtBBF1 (rolB domain B factor 1) in Dof transcription factors of tobacco regulates tissue-specific expression and auxin-induced expression [29] by interacting with the ACTTTA region of the promoter of oncogene rolB in its apical meristem and microtubule tissues, and thereby affecting root development [29].PbDof9.2 in pears can regulate flowering time.Overexpression of PbDof9.2 in Arabidopsis can delay flowering time by interacting with the promoters of PbTFL1a and PbTFL1b [30].
In grain full stage of cereal crops, such as corn, rice, wheat, the identification of T G T A A A G sequences associated with grain full stage can regulate the biosynthesis of stored proteins and the expression of other proteins in grain full stage.Overexpression of Dof transcription factor SRF1 in sweet potato significantly inhibited the transcription of Ibβfruct2 gene.Thereby changing the carbon metabolism of sweet potato tubers and significantly reducing the accumulation of sucrose invertase have increased the starch content in sweet potato tubers by reducing the concentration of monosaccharides [31].Kushwaha et al. [32] found that ZmPBF transcription factor of Dof family in maize could specifically bind to P-box, the cis-element in the promoter of olysin gene, activate the transcription of the gene, and affect the endosperm protein content.Wu [25] et al. found that ZmDof36 in corn could positively regulate starch accumulation, and the expression of genes related to starch synthesis increased in overexpressed strains, and the starch content increased.In other species such as Arabidopsis thaliana [33], cotton [34] and Chlamydomonas reinhardtii [35], Dof family transcription factors have also been shown to increase lipid content in grain.In conclusion, transcription factors of Dof family play an important role in regulating seed storage protein and oil accumulation, and are the key to seed development.
In recent years, the progress of plant genome sequencing has greatly promoted the identification of Dof genes in many plants [47].However, the genome-wide structure and function of most Dofs remain to be elucidated, especially in C. oleifera, an important cash crop, and the identification and functional analysis of Dof transcription factors in C. oleifera genome are still blank.In this study, Dof transcription factors in C. oleifera genome were comprehensively identified, including chromosome localization, motif composition, gene structure, conserved domain, collinearity analysis and phylogenetic analysis.It is a foundation for the analysis of the potential role of Dof family members in the growth, development and tissue differentiation of C. oleifera.
In order to study the effects of drought stress and salt tolerance on C. oleifera, we conducted experiments on the root system of C. oleifera treated with different concentrations of salt and PEG6000.PEG6000 is a response experiment that simulates drought stress.At Jun 15th to July 15th, 2023, we conducted different treatment experiments on the roots of C. oleifera under different conditions in Jiajiang County, Leshan City, Sichuan Province.The first experimental treatment was as follows: the treatment group treated one C. oleifera foot with NaCl solutions at concentrations of 5.0, 10.0, and 15.0 g/L for 72 h, while the control group had a concentration of 0 g/L.The second experimental treatment is as follows: the treatment group is one C. oleifera foot at PEG6000 of 3%, 6%, 9% concentration for 72 h.while the control group had PEG6000 of 0% concentration.All young leaf samples of C. oleifera under different conditions were stored in liquid nitrogen before being transported back to the laboratory for storage in -80 °C freezer for DNA and RNA extraction.The total RNA was extracted from young leaves of the treatment group and the control group.Reverse transcription of purified RNA into cDNA using a reverse transcription kit, and reverse transcribed cDNA was used for qRT-PCR to verify the expression of Dof transcription factor family genes in C. oleifera.

Chromosomal location and gene structure analysis of ColDof genes
TBtools software was used to simplify and analyze the chromosomal location and gene structure of ColDof genes.TBtools software was used for chromosome localization and gene structure analysis maps.

Physical and chemical properties analysis of ColDof proteins
Protparam (http://web.expasy.org/protparam/)online tools was used to analyze the molecular weight of protein, isoelectric point and instability index, the total average hydrophobicity, liposoluble coefficient of ColDof proteins.

Conserved motif analysis of ColDof proteins
MEME (http://meme-suite.org/tools/meme)was used to analyze the conserved motif of ColDof proteins.The motif number of parameters was set to 10, and all other parameters were default settings.

SSR loci and microRNA prediction of ColDof genes' promoter
TBtools software was used to process the ColDof gene family sequences, and then the online tool IPK (https:// webblast.ipk-gatersleben.de/misa/)was used to predict the SSR locus in the promoter of ColDof genes.psRNA-Target (https://www.zhaolab.org/psRNATarget)online tool was used to predict the miRNAs of ColDof genes.

Cis-acting elements and transcription factor binding site analysis of ColDof genes
The 2000 bp upstream sequence of ColDof genes was extracted from C.oleifera genome sequences by using TBtools software based on the GFF3 file.PlantCARE online search tool (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) was used to predict the ciselements that may be involved in the regulation of ColDof gene expression in C. oleifera.The 2000 bp upstream of ColDof genes were used to analyze transcription factor binding sites of ColDof genes by PlantRegMap (http:// plantregmap.gao-lab.org/binding_site_prediction.php)online tool and TBtools software.

Codon preference analysis of ColDof gene family in C.oleifera
CodonW tool was used to analyze codon preference of ColDof gene family, and PR2.plot was used to analyze codon preference of ColDof gene family.

Collinearity analysis of ColDof gene family
The Fasta Stats tool in TBtools software was used to process the genome sequence and obtain the chromosome length file.Then One Step McScan-super Fast tool was used to compare C.oleifera protein itself, and blast results were obtained.Parse was also used to obtain the location of all ColDof genes based on GFF3 gene location information, and Advanced Circos was used to visualize the data.

Phylogenetic analysis of ColDof proteins
Clustal X was applied to sequence comparison of ColDof protein sequences of C.oleifera.The phylogenetic tree of ColDof proteins was constructed with the software MEGA7 and Neighbor-Joining method (NJ) was adopted.The verification parameter bootstrap was repeated for 1000 times, and other parameters were the default values.The evolutionary tree of C. oleifera and Arabidopsis thaliana was constructed with the same method.A phylogenetic tree of Dof proteins sequences of 9 species, including C.oleifera, Arabidopsis thaliana, Camellia lanceoleosa, Camellia sinensis var.assamica cv.Yunkang 10, Camellia sinensis var.sinensis cv.Longjing 43, Camellia sinensis var.sinensis cv.Shuchazao, Camellia sinensis var.sinensis cv.Tieguan Yin and Camellia sinensis ar.sinensis cv.Biyun, was constructed by Maxmumm Like-lihood (ML) method.The calibration parameter bootstrap was repeated 1000 times.

Protein-protein interaction analysis of ColDof proteins
ColDof protein sequences were uploaded to the interaction database String (https://string-db.org/) to analyze the protein-protein interaction of ColDof protein family.Reference species of ColDof proteins was set to "Arabidopsis thaliana", keep the remaining parameters set to default, store the results in TSV format, import the TSV file into Cytoscape 3.8.2, and analyze the network (Cytoscape → Tools → Network analyzer → Network analysis → Analyze network), save the network analysis results, and reflect the size of the Degree using node size and color.The larger the node, the greater the Degree value; The thickness of the edge was used to reflect the size of the Combine score.The thicker the edge, the larger the Combine score.The core target was selected to create a protein interaction network diagram.
All young leaf samples of C. oleifera under different conditions were stored in liquid nitrogen before being transported back to the laboratory for storage in -80 °C freezer for DNA and RNA extraction.The total RNA was extracted from young leaves of the treatment group and the control group by using plant RNA Extraction Kit from Beijing Tiangen Biotech Co., Ltd.Using the reverse transcription kit purchased from Beijing Tiangen Biotech Co., Ltd., the extracted and purified mRNA samples were reverse transcribed into cDNA.With the cDNA as a template, the 18S rRNA gene as an internal reference gene, and the designed qRT-PCR primers as a guide, the cDNA was subjected to PCR amplification under different NaCl and PEG6000 treatment conditions to obtain the expression levels of each ColDof gene.The relative expression level of the target gene ColDof was calculated by using the expression level of the reference gene as a reference.TBtools was used to process and heat map the obtained relative expression level of ColDof genes.All ColDof gene primers designed by TBtools Batch q-RT-PCR primer design tool used in the qRT-PCR validation experiment in this study are shown in Supplementary Table 1.The qRT-PCR primers used in this study were synthesized by Sangon Biotech (Shanghai) Co., Ltd on commission.

Results
Chromosomal location and gene structure analysis of ColDof genes 45 Dof genes have been identified in C. oleifera genome.They were named ColDoF1-ColDof45 according to their gene descriptions.ColDof3 ∼ ColDof6 and ColDof31 ∼ ColDof35 were located on chromosome 3 and had 9 genes.ColDof7 ∼ ColDof10 and ColDof42 ∼ ColDof44 were located on chromosome 4 and had 8 genes.ColDof37 and ColDof38 were located on chromosome 6 and had two genes.ColDof11 ∼ ColDof14 and ColDof39 were located on chromosome 7 and had 5 genes.ColDof16 ∼ ColDof19 and ColDof40 were located on chromosome 9 and had 5 genes.ColDof20 ∼ ColDof25, ColDof1 and ColDof2 were located on chromosome 10 and had 8 genes.ColDof26 and ColDof27 were located on chromosome 12 and had two genes.ColDof30 and ColDof41 were located on chromosome 15 and had two genes.ColDof45, ColDof15, ColDof28, and ColDof29 were located on chromosomes 5, 8, 13, and 14, respectively (Supplementary Tables 2 and Fig. 1).
Among 45 ColDof genes, 22 ColDof genes were composed of introns and exons, among which ColDof21 and ColDof23 contain one intron, and the other 20 contain two introns.Among 45 ColDof genes, the number of exons ranged from one to five, with 20 ColDof genes having one exon, 21 ColDof genes having two exons, two ColDof genes having three exons, and two ColDof genes having five exons.The family members with the most exons are ColDof21 and ColDof23 (Fig. 2).

Physical and chemical properties of ColDof proteins
According to the physicochemical properties of ColDof proteins, the highest number of amino acids in ColDof proteins was ColDof25.The amino acid number of  ColDof25 was 638 aa.The lowest number of amino acid in ColDof proteins was ColDof13.The amino acid number of ColDof13 was 120 aa.The overall amino acid content of ColDof proteins was between 120 aa and 638 aa.The average number of amino acids was 501 aa.The molecular weight of the protein ranges from 13.45278 to 70.04240 kD.The isoelectric point (pI) values ranged from 4.89 to 9.65, but 15 of the family members have theoretical isoelectric points less than 7, and the rest have theoretical isoelectric points greater than 7.The comparison of instability index values showed that the instability index values of ColDof06 and ColDof20 were lower than 40 and could be predicted to be stable proteins, while the instability index values of other sequences were all higher than 40 and were unstable proteins.The content of aliphatic index of ColDof ranged from 30.17 to 84.43, indicating that the thermal stability of this family of proteins varied greatly.Through the prediction of protein hydrophilicity/hydrophobicity, the GEAVY values of 45 ColDof were all negative, indicating that 45 ColDof members were hydrophilic protein members.The highest hydrophilic value was − 0.262, which was on ColDof12.The lowest hydrophilic value was − 1.191 in ColDof09.The total number of negatively charged residues in 15 of the 45 family members was greater than the total number of positively charged residues, indicating these 15 ColDof proteins were negative charge proteins.In 25 ColDof proteins, the total number of positively charged residues is greater than the total number of negatively charged residues, indicating these ColDof proteins were positive charge proteins.The total number of positively charged residues of the remaining five members was equal to the total number of negatively charged residues (Supplementary Table 3).

Signal peptide and subcellular localization of ColDof proteins
According to the prediction of ColDof protein signal peptide, 45 ColDof proteins had no signal peptide, which suggested that they were all non-secreted proteins.Subcellular localization prediction of 45 ColDof proteins showed that all 45 ColDof proteins were located on the nucleus (Supplementary Table 3).

Transmembrane structure, hydrophilicity and phosphorylation site of ColDof proteins
The transmembrane domain prediction analysis of ColDof proteins showed that none of 45 ColDof protein members had transmembrane phenomenon, so it was inferred that ColDof protein was non-transmembrane protein.
Hydrophilic/hydrophobic analysis showed that the maximum hydrophilic values of ColDof protein members ranged from 0.978 to 3.133, and the maximum hydrophilic values ranged from − 4.056 to -2.133.ColDof20 had a maximum value of 3.133, and ColDof39 had a minimum value of -4.056.According to the law that the lower the amino acid fraction, the stronger the hydrophilicity and the higher the fraction, the stronger the hydrophobicity, it can be seen that serine 27 on ColDof20 had the strongest hydrophobicity.Arginine, the 20th loci of ColDof39, had the strongest hydrophilic value, and as a whole, the hydrophilic value was more and more dense than the hydrophobic value.Therefore, the expression of ColDof protein was hydrophilic and it can be considered that ColDof protein was a hydrophilic protein (Supplementary Table 4).
The phosphorylation sites analysis of ColDof proteins showed that there were 2201 serine (Ser) phosphorylation sites, 763 threonine (Thr) phosphorylation sites and 211 tyrosine (Tyr) phosphorylation sites in 45 ColDof proteins.The serine (Ser) phosphorylation sites of ColDof34 were the largest, with 129.There were 15 members with the maximum value of 0.998 at the serine (Ser) site, and ColDof27 had the largest number of serine (Ser) phosphorylation sites (84).ColDof08 had a maximum value of 0.983 at the threonine (Thr) site, which was 170 threonine (Thr).ColDof10 had a maximum value of 0.978 at the tyrosine (Tyr) site, which was a serine (Ser) in the 211 position.The total number of phosphorylation sites of ColDof27 was 120.Member ColDof08 had the least phosphorylation sites (31).As the serine content is the highest in this family as a whole, we can infer that the protein functions mainly through phosphorylation at the serine (Ser) site (Table 1).
The tertiary structure of 45 ColDof proteins was predicted.According to the similarity of tertiary structure of each member, ColDof protein family could be divided into 40 categories.Among them, ColDof03 and ColDof04, ColDof16 and ColDof17, ColDof22 and ColDof24, ColDof37 and ColDof38 were the same category.And ColDof12 was different from the others in that its structure was mainly α-helix.From the tertiary  structure diagram, it can be seen that the random coil in the Dof family members accounted for a large part, while other structures were scattered in the protein structure (Supplementary Fig. 1), which was consistent with the predicted results of the secondary structure.

Conserved motif analysis of ColDof proteins
The  ,  3)

SSR loci analysis of ColDof genes' promoter
The SSR loci analysis showed that SSR locis of ColDof genes were rich in repeat types such as mononucleotide, dinucleotide, trinucleotide and complex nucleotide.The number of each repeat type varied greatly, but single nucleotide repeats dominated, accounting for 42.65% of all SSRs with a total length of 462 bp.The distribution of single nucleotide SSR sites showed A clear preference, and the number of SSR sites for motif A/T was 25, while that for motif C/G was only 4. Dinucleotide repeats accounted for 20.59% of all SSRs, with a total length of 348 bp, and the main motif type was CT/TC.Trinucleotide repeats accounted for 29.41% of all SSRs, with a total length of 345 bp.Compound repeat sequences accounted for 7.35% of all SSRs, and the total length was 475 bp.Overall, the length of SSR locis in most ColDof genes in C.oleifera was less than 50 bp, accounting for 92.65% of all SSRs (Supplementary Table 6).A cell cycle regulatory element was found in one family member, ColDof14.Hypoxia-specific induction elements were found in four ColDof genes, and only one was found in the promoter region of ColDof18, ColDof34, ColDof38 and ColDof40.Only one ColDof gene, ColDof1, was found to be injury-sensitive.ColDof genes contain a variety of cis-elements and these family members are expected to play a key role in the response of oil tea to environmental stress and hormonal control (Fig. 4).

Codon preference analysis of ColDof gene family in C.oleifera
The average content of the third codon is T3s > A3s > C3s > G3s.The average GC content (GC) of codons ranges from 0.34 to 0.58, with an average of 0.   8).There are 30 high-use codons (RSCU > 1), including 13 U terminals, 9 A terminals, 3 G terminals, and 5 C terminals (except stop codons UAA, UGA, and UAG, and start codons AUG and UGG).Of the 29 low-usage codons, 11 end in C, 10 end in G, 5 end in A, and 3 end in U.This indicates that the preference for high-usage codons ends at U and the preference for low-usage codons ends at C. In addition, the RSCU value of AGA is greater than 2, indicating a strong preference for this codon among members of ColDof gene family (Supplementary Table 9).

Transcription factor binding sites of ColDof genes
Transcription factor binding site analysis showed that all ColDof promoter regions had dense TFBSs distribution.According to the number of binding sites, we selected three basic TFBSs for demonstration, among which ERF was the largest with 5793 (Fig. 5A), followed by Dof with 4381 (Fig. 5B), MYB with 2206 (Fig. 5C), and BCR-BPC with 3702 (Fig. 5D).Three ColDof (ColDof9, ColDof39, and ColDof44) are expected to have the most TFBSs.The prediction of TFBSs provides a basis for further identification and verification of target genes.

Collinearity analysis of ColDof gene family
Gene replication can occur in a variety of ways.In the process of biological evolution, gene families are mainly amplified by fragment replication, tandem replication and whole genome replication, and the replicated genes control the physiological and morphological evolution of plants.The association among gene family members was further studied by comparing ColDof proteins, and 24 pairs of fragment replicators were found in 45 ColDof genes (Fig. 6; Table 2).Of the 15  ColDof genes of C.oleifera were hypothesized to have undergone a certain scale of fragment replication events during evolutionary development (Fig. 6).
Since the Ka/Ks ratio is a good indicator of the selection pressure occurring at the protein level, we used TBtools software to estimate the Ks(synonymous) and Ka(non-synonymous) values as well as the Ka/Ks ratio.Ka/Ks < 1, Ka/Ks = 1 and Ka/Ks > 1 are generally considered to represent negative, neutral and positive selection, respectively.The Ka/Ks of 24 duplicate ColDofs were all < 1, ranging from 0.13 to 0.43, indicating that all duplicate gene pairs were under strong purification selection, which is consistent with the observation of other plants, such as apple and tomato (Table 2).
In order to show the relationship between C.oleifera and tea tree more directly, the Dof gene family of tea tree diploid genome was collinear analysis, and 82  In order to further infer the origin and evolutionary mechanism of ColDof gene in C. oleifolia, the homology relationship between 45 ColDofs and other Camellia species was studied.These species include the haploid genome, diploid genome, Xiangye Camellia, Yunkang 10, Longjing 43, Shuchazao, Tieguanyin and Biyun, which have obvious collinearity with C. oleifera.It is worth noting that ColDofs of C. oleifolia, tea tree haploid genome and tea tree diploid genome were significantly stronger than other species, which may be related to the closely related species of C. oleifolia, tea tree haploid genome and tea tree diploid genome (Fig. 7).

Molecular evolution analysis of ColDof proteins
Through the evolutionary tree composed of 45 ColDof proteins, we can see that 45 ColDof proteins were clearly divided into 5 groups according to their degree of aggregation in the evolutionary tree (thus labeled Group1, Group2, Group3, Group4, Group5).In the whole family, Group1 had the most primitive evolutionary speed, including 7 ColDof proteins, among which ColDof14 had the slowest evolution, ColDof16 and ColDof17 had the fastest evolution.The second group with the fastest evolution speed was Group2, which contains 13 ColDof proteins, among which ColDof19 and ColDof11 evolve the slowest, and ColDof41 and ColDof27 evolve the fastest.Group5 was the fastest evolving branch, comprising 13 ColDof proteins, of which ColDof17 was the slowest, and ColDof18 and ColDof26 were the fastest (Fig. 8).
The phylogenetic tree of ColDof and AtDof gene families of Arabidopsis thaliana was divided into 7 subbranches (thus labeled Group1, Group2, Group3, Group4, Group5, Group6, Group7), each containing 2-46 members.In the whole family, Group1 was the most primitive evolutionary branch, including 4 members, including 1 ColDof member and 3 AtDof members, and ColDof35 was the slowest evolution; Second, Group2 had 7 members, including 5 ColDof members and 2 AtDof members.ColDof28 was the slowest, ColDof25 was the fastest, and ColDof34 was the same.Group3 contained two ColDof members and two AtDof members, and ColDof36 was the slowest evolution, AtDof6 and AtDof7 were the fastest evolution; Group4 contained 12 ColDof members and 15 AtDof members, and ColDof15 was the slowest in evolution, AtDof42, ColDof18, ColDof26 and ColDof6 with similar evolutionary speed, and ColDof30 was the fastest in evolution.At the same rate of evolution is AtDof37; Group5 contains two AtDof members, AtDof14 and AtDof15; Group6 contained ColDof1 and AtDof46; Finally, Group7 was the fastest evolving branch, including 24 ColDof members and 22 AtDof members, and ColDof12 was the slowest evolving, ColDof27 was the fastest evolving, and AtDof40 and AtDof41 were similar to its evolving speed.Overall, Group1 was the most original, while Group6 was the fastest growing (Supplementary Fig. 5A).
In order to further study the evolutionary relationship of Dof genes in C.oleifera, the phylogenetic trees of ClaDof and ColDof gene families were divided into 6 subbranches (thus labeled Group1, Group2, Group3, Group4, Group5, Group6).Each subbranch contains 6-41 members.In the whole family, Group1 had the most primitive evolutionary speed, including 12 members, including 4 ColDof members and 8 ClaDof members, and ColDof15 had the slowest evolutionary speed, and ClaDof8 and ColDof15 had the same slow evolutionary speed.Group2 had 25 members, including 10 ColDof members and 15 ClaDof members, and ColDof20 had the slowest evolution; ClaDof52 and ColDof52 had the same slow evolution speed; ColDof45 had the fastest evolution speed.ClaDof20, ClaDof40, ClaDof35 and ClaDof38 had similar evolutionary speed.Group3 contained 3 ColDof members and 4 ClaDof members, and ColDof18 was the slowest in evolution, ClaDof9 and ColDof18 had the same slow evolution speed, ColDof26 had the fastest evolution speed, and ClaDof2 and ClaDof47 had similar evolution speed.Group4 contained 3 ColDof members and 3 ClaDof members, and ColDof7 was the slowest in evolution, ClaDof26 was the same as its evolution speed, and the fastest in evolution were ColDof9, ColDof44, ClaDof6 and ClaDof22.Group5 contained 4 ColDof members and 5 ClaDof members, and ColDof43 was the slowest in evolution, ClaDof24 was the same as ColDof43 in evolution speed, and ColDof36 was the fastest in evolution speed, and ClaDof5 and ClaDof23 were similar in evolution speed.Finally, Group6 had the fastest evolutionary speed, including 21 ColDof members and 20 ClaDof members, and ColDof1 was the slowest evolutionary speed, ClaDof51 and ColDof1 had the same evolutionary speed, ColDof8 had the fastest evolutionary speed, ClaDof22 had the same evolutionary speed.Overall, Group1 was the most original, while Group6 was the fastest growing (Supplementary Fig. 5B).
To study the evolutionary relationship of Dof genes among different teas, The phylogenetic tree of ColDof and CasDof gene families of Pu-erh tea was divided into 8 subbranches (thus labeled Group1, Group2, Group3, Group4, Group5, Group6, Group7, Group8), each containing 1-36 members.In the whole family, Group1 was the most primitive evolutionary branch, including one member ColDof1; Group2, which has two ColDof members and one CaSDof member, had the slowest evolution, and CaSDof16 had the slowest evolution, while ColDof45 and ColDof36 had the fastest evolution.Group3 contains 5 ColDof members and 5 CaSDof members, and ColDof5 was the slowest in evolution, CaSDof4 was the same as ColDof5 in evolution speed, and ColDof29 was the fastest in evolution speed, and CaSDof29 was the same in evolution speed.Group4 contained two ColDof members and three CaSDof members, and ColDof15 was the slowest in evolution, CaSDof39 was the same as the slow in evolution, ColDof33 was the fastest in evolution, CaS-Dof27 and CaSDof1 were the same in evolution.Group5 contained 3 ColDof members and 1 CaSDof member, and ColDof44 and ColDof9 were the slowest in evolution, ColDof7 was the fastest in evolution, and CaSDof32 was the same in evolution speed.Group6 contained 10 ColDof members and 9 CaSDof members, and ColDof26 was the slowest in evolution, CaSDof19 was the same as ColDof26 in evolution speed, and ColDof31 and ColDof32 were the fastest in evolution.Just as fast are CaSDof2 and CaSDof26; Group7 contained two ColDof members and three CaSDof members, and ColDof10 has the slowest evolution, CaSDof30 has the same slow evolution, ColDof43 had the fastest evolution, and CaS-Dof33 has the same fast evolution.Finally, Group8 was the fastest evolving branch, which contained 20 ColDof members and 17 CaSDof members, and ColDof11 was the slowest evolving branch, CaSDof36 and ColDof11 are the same evolving speed.The most rapidly evolving were ColDof21, ColDof22, ColDof23 and ColDof24, with CaS-Dof38 evolving just as fast.In summary, ColDof1 was the most original team, while Group8 was the fastest growing team (Supplementary Fig. 5C).
The phylogenetic tree of ColDof and GWHDof gene families of C. sinensis var.Longjing 43 was divided into 8 subbranches (thus labeled Group1, Group2, Group3, Group4, Group5, Group6, Group7, Group8).Each subbranch contained 2-37 members.In the whole family, Group1 was the most primitive evolutionary speed, including two ColDof members, ColDof31 and ColDof32, and one GWHDof member, GWHDof20, and all three members evolve at the same slow speed.Group2, which had two ColDof members and two GWHDof members, had the slowest evolution, and GWHDof6 had the slowest evolution, while ColDof25 and ColDof34 had the fastest evolution, and GWHDof22 had the same evolution speed.Group3 contained a ColDof member, ColDof28, and a GWHDof member, GWHDof4, both of which evolve at the same speed.Group4 contains one ColDof member and two GWHDof members, and GWHDof1 was the slowest in evolution, ColDof1 was the fastest in evolution, and GWHDof2 was the same in evolution speed.Group5 contained two ColDof members, and ColDof35 and ColDof12 evolve at the same speed.Group6 contained 13 ColDof members and 12 GWHDof members, and ColDof29 and ColDof30 were the slowest in evolution, and GWHDof30, GWHDof7 and ColDof20 were similar in evolution speed, and ColDof7 was the fastest in evolution.Just as fast were GWHDof33 and GWHDof34; Group7 contains two ColDof members and two GWHDof members, and ColDof42 was the slowest in evolution, GWHDof29 was the same in evolution, ColDof36 was the fastest in evolution, and GWHDof26 is the same in evolution speed.Finally, Group8 was the fastest evolving group, consisting of 22 ColDof members and 15 GWHDof members, and ColDof11 was the slowest evolving group, with similar evolutionary speed including GWHDof12, GWHDof11 and ColDof19.The most rapidly evolving were ColDof37 and ColDof38, and as rapidly evolving are GWHDof9; In summary, ColDof1 was the most original team, while Group8 was the fastest growing team (Supplementary Fig. 5D).
The phylogenetic tree of ColDof and CSSDof gene families of Camellia sinensis var.sinensis cv.Shuchazao was divided into 6 subbranches (labeled Group1, Group2, Group3, Group4, Group5, Group6), each containing 1-40 members.In the whole family, Group1 had the most primitive evolutionary speed, including 10 members, including 7 ColDof members and 3 CSSDof members, and ColDof28 was the slowest evolutionary speed, and CSSDof32 was the same as its evolutionary speed.Group2, which has 31 members, including 15 ColDof members and 16 CSSDof members, and ColDof20, which had the slowest evolution, had similar evolutionary speed with CSSDof36, CSSDof40, ColDof30 and ColDof29.ColDof5 evolved the fastest, while CSSDof17 evolved at the same rate.Group3 contained one CSS-Dof member, CSSDof23.Group4 contained one ColDof member, ColDof1, and one CSSDof member, CSSDof37, both of which evolve at the same speed.Group5 contained two ColDof members and two CSSDof members, and ColDof19 was the slowest in evolution, CSSDof12 and ColDof19 have the same slow evolution speed, and ColDof11 had the fastest evolution speed, and CSSDof41 has the same evolution speed.Finally, Group6 was the fastest evolutionary group, which contained 20 ColDof members and 20 CSSDof members, and ColDof13 was the slowest evolutionary group, and its evolutionary speed was similar to that of CSSDof1, ColDof17 and ColDof16.The most rapidly evolving were ColDof37 and ColDof38, with CSSDof35 and CSSDof38 evolving just as fast (Supplementary Fig. 5E).
The phylogenetic tree of ColDof and SIVDof gene families of Camellia sinensis var.Sinensis Tieguanyin was divided into 6 subbranches (thus labeled Group1, Group2, Group3, Group4, Group5, Group6), each containing 3-37 members.In the whole family, Group1 had the most primitive evolutionary speed, including 4 members, including 2 ColDof members and 2 SIVDof members, and ColDof33 had the slowest evolution, and SIVDof13 and ColDof33 had the same evolutionary speed.Group2 has 19 members, including 8 ColDof members and 11 SIVDof members, and ColDof18 had the slowest evolution, SIVDof41 had the same slow evolution as ColDof18, and ColDof29 has the fastest evolution.SIVDof39 and SIVDof38 had similar evolutionary speed.Group3 contained 8 ColDof members and 8 SIVDof members, and ColDof1 was the slowest in evolution, SIVDof37 was the same as ColDof1 in evolution speed, and ColDof25 was the fastest in evolution speed, and SIVDof1 is the same in evolution speed.Group4 contains two ColDof members and one SIVDof member, and ColDof36 was the slowest in evolution, while ColDof42 and SIVDof22 are the fastest in evolution.Group5 contains 5 ColDof members and 4 SIVDof members, and ColDof11 was the slowest in evolution, SIVDof6 and ColDof11 had the same slow evolution speed, ColDof9 had the fastest evolution speed, and SIVDof9 had the same evolution speed.Finally, Group6 was the fastest evolving group, containing 20 ColDof members and 17 SIVDof members, and ColDof13 was the slowest evolving group, with similar evolutionary speed including SIVDof40, ColDof16 and ColDof17.The most rapidly evolving were ColDof21 and ColDof23; Overall, Group1 was the most original, while Group6 was the fastest growing (Supplementary Fig. 5F).
The phylogenetic trees of ColDof and MJRDof gene families were divided into 10 subbranches (thus labeled Group1, Group2, Group3, Group4, Group5, Group6, Group7, Group8, Group9, Group10).Each subbranch contains 1-29 members.In the whole family, Group1 was the most primitive evolutionary branch and contained one MJRDof member.The next group that evolved faster was Group2, which included one ColDof member, ColDof18.Group3 contained two ColDof members and two MJRDof members, and ColDof6 was the slowest in evolution, MJRDof5 is the same as ColDof6 in evolution speed, ColDof26 was the fastest in evolution speed, and MJRDof4 is the same in evolution speed.Group4 contained three ColDof members and one MJRDof member, and ColDof30 and ColDof29 evolve the slowest, while ColDof20 and MJRDof2 evolve the fastest.Group5 contained one ColDof member and one MJRDof member, and MJRDof20 evolves at the same speed as ColDof15.In Group6, there were 6 ColDof members and 2 MJRDof members, and ColDof7 was the slowest in evolution, ColDof44 and ColDof9 were similar in evolution speed, ColDof45 was the fastest in evolution, and MJRDof15 was the same in evolution speed.There were 8 ColDof members and 6 MJRDof members in Group7, and ColDof36 was the slowest in evolution, MJRDof8 was the same in evolution speed, ColDof25 was the fastest in evolution speed, MJRDof3 was the same in evolution speed.Group8 contained 3 ColDof members and 1 MJRDof member, and ColDof1 was the slowest evolution, while ColDof11 and MJRDof16 were the fastest evolution.Group9 contains 10 ColDof members and 1 MJRDof member, and ColDof43 was the slowest in evolution, MJRDof9 and ColDof10 were similar in evolution speed, and ColDof16 and ColDof17 were the fastest in evolution speed.Finally, Group10 was the fastest evolving group, including 11 ColDof members and 8 MJRDof members, and ColDof40 was the slowest evolving group, with the same evolutionary speed as MJRDof22, and the fastest evolving group was ColDof21.As fast as ColDof21 which was evolving, was MJRDof13.Overall, Group1 was the most original, while Group10 is the fastest growing (Supplementary Fig. 5G).
In order to more intuitively show the evolutionary relationship between oil tea and tea tree, The phylogenetic tree of the tea diploid genome ColDof and DipDof gene families was divided into 8 subbranches (thus labeled Group1, Group2, Group3, Group4, Group5, Group6, Group7, Group8), each containing 1-41 members.In the whole family, Group1 was the most primitive evolutionary speed, including one DipDof member DipDof8; Second, Group2 had a slightly faster evolutionary speed, including two ColDof members and one DipDof member.Among them, ColDof11 had the fastest evolutionary speed, and ColDof19 had the slowest evolutionary speed, and DipDof41 had the same evolutionary speed.Group3 contained one ColDof member, ColDof1, and one DipDof member, DipDof36.Group4 contained 5 ColDof members and 2 DipDof members, and ColDof14 and Dip-Dof9 evolve the slowest, while ColDof16 and DipDof38 evolve the fastest.Group5 contained two ColDof members and two DipDof members, and DipDof10 evolves as fast as ColDof3.Group6 contains 3 ColDof members and 3 DipDof members, and ColDof2 was the slowest evolving, DipDof4 is the same evolving speed, ColDof10 was the fastest evolving speed, DipDof24 was the same evolving speed.There are 11 ColDof members and 9 Dip-Dof members in Group7, and ColDof40 was the slowest in evolution, and DipDof40 was the same in evolution speed, ColDof27 was the fastest in evolution speed, and DipDof35 is the same in evolution speed.Group8 contained two ColDof members and two DipDof members, and ColDof42 and DipDof22 evolve the slowest, while ColDof36 and DipDof26 evolve the fastest.In Group9, there were 6 ColDof members and 5 DipDof members, and ColDof7 was the slowest in evolution, DipDof21 was the same in evolution speed, and ColDof5 and DipDof11 were the fastest in evolution.Finally, Group10 was the fastest evolving group, including 13 ColDof members and 15 DipDof members, and ColDof15 was the slowest evolving group, which had the same evolutionary speed as DipDof31, and the fastest evolving group is ColDof31.As fast as ColDof31, which was evolving was DipDof13.Overall, Group1 was the most original, while Group10 was the fastest growing (Supplementary Fig. 5H).
The phylogenetic trees of ColDof and HapDof gene families were divided into 8 subbranches (thus labeled Group1, Group2, Group3, Group4, Group5, Group6), each containing 6-54 members.In the whole family, Group1 has the most primitive evolutionary speed, including 13 members, including 3 ColDof members and 10 HapDof members, and ColDof20 had the slowest evolution, ColDof30 had the fastest evolution speed, and HapDof64 and HapDof62 had the same evolutionary speed.Group2 had 16 members, including 6 ColDof members and 10 HapDof members.ColDof28 is the slowest, and HapDof56 and HapDof57 had the same speed of evolution.The most rapidly evolving were ColDof31, ColDof32, HapDof19, and HapDof25.In Group3, there were 5 ColDof members and 10 Hap-Dof members, and ColDof36 was the slowest in evolution, and HapDof43 and HapDof51 were similar in evolution speed, and ColDof6 was the fastest in evolution.HapDof18 and HapDof24 evolve at a similar rate.In Group4, there were 7 ColDof members and 11 HapDof members, and ColDof15 was the slowest, with the same evolutionary speed as HapDof59, and ColDof5 was the fastest, with the same evolutionary speed as HapDof17 and HapDof23.Group5 contained two ColDof members and four HapDof members, and ColDof11 was the slowest to evolve, with HapDof73, HapDof77 and ColDof19 evolving equally fast.Finally, Group6 is the fastest evolving team, including 22 ColDof members and 32 HapDof members, and ColDof1 was the slowest evolving team, with the same evolutionary speed as HapDof66 and Hap-Dof67.The most rapidly evolving were ColDof41, Hap-Dof61, and HapDof63; Overall, Group1 was the most original, while Group6 was the fastest growing (Supplementary Fig. 5I).

Protein-protein interaction analysis of ColDof protein family
Import ColDof protein sequence file into String (https:// string-db.org/)Perform protein-protein interaction prediction in the database, set the species to "Arabidopsis thaliana", save the network file in TSV format, import the TSV file into Cytoscape 3.8.2software to draw the protein-protein interaction network, perform topology analysis on the network, reflect the size and color of the target with degree values, and reflect the thickness of the edges with combined score values, thereby constructing a protein-protein interaction network.The network consists of 21 nodes and 101 edges, with ColDof34, ColDof20, ColDof28, ColDof35, ColDof42, and ColDof26 as core targets.And ColDof34 had the most protein interactions, Protein-protein interaction analysis showed that ColDof34, ColDof20, ColDof28, ColDof35, ColDof42 and ColDof26 have the most protein interactions (Fig. 10).The qRT-PCR results of the leaves of C. oleifera treated with NaCL solutions of different concentrations(5 g/L, 10 g/L,15 g/L) for 72 h showed that 45 ColDof genes were expressed in both the control group and the experimental group.Compared with the control group (without NaCl solution treatment:0 g/L), as the concentration of NaCl solution in the treatment group increased, the expression levels of most ColDof genes significantly decreased, while only the expression levels of ColDof1, ColDof2, ColDof14 and ColDof36 genes significantly increased.This indicates that ColDof1, ColDof2, ColDof14 and ColDof36 may be involved in the response to salt stress in C. oleifera (Fig. 12).

Gene expression analysis of
The qRT-PCR results of the leaves of C. oleifera treated with PEG6000 solutions of different concentrations(3%, 6%, 9%) for 72 h showed that 45 ColDof genes were expressed in both the control group and the experimental group.Compared with the control group (without PEG6000 solution treatment:0%), as the concentration of PEG6000 solution in the treatment group increased, the expression levels of most ColDof genes significantly decreased, while only the expression levels of ColDof1, ColDof2, ColDof5 ColDof14,ColDof27 and ColDof36 genes significantly increased.This indicates that ColDof1, ColDof2, ColDof5 ColDof14,ColDof27 and ColDof36 may be involved in the response to drought stress in C. oleifera (Fig. 13).In this study, Dof protein sequence information of C.oleifera was used to comprehensively analyze Dof family of C.oleifera, and 45 ColDof genes were identified.Analysis of their domains and motifs showed that all of them contained complete C2-C2 single zinc finger structure, indicating that the Dof transcription factor family was conservative in the process of species evolution.
Studies have shown that the theoretical isoelectric points of ColDof proteins in different plants are basically 5.41 ∼ 6.97, and the number of basic amino acids is generally higher than that of acidic amino acids.However, in this study, the isoelectric points of ColDof proteins in C.oleifera were mainly concentrated in 4.89 ∼ 9.65, and most of them were alkaline amino acids, indicating that the isoelectric points of Dof family members of different plants were very different (Supplementary Table 2).According to the prediction analysis of amino acid transmembrane structure, hydrophilicity and phosphorylation sites of Dof family members, it can be concluded that these 45 ColDof proteins were non-transmembrane proteins, and all proteins were hydrophilic proteins.Their protein function was mainly achieved by phosphorylation at the serine site (Table 1).Subcellular localization prediction results showed that ColDof genes were all located in the nucleus, and if they were located on the cell membrane, they might be expressed in organelles such as chloroplasts and Golgi apparatus, or in the cytoplasm, indicating that the functions of these genes might also be specific [54].So far, there were relatively few reports on the existence of signaling peptides in Dof gene.A signal peptide was a chain of peptides in a protein molecule that has the ability to transmit signals outside or inside the cell.In soybean GmDof genes, the promoter region contains a conserved region that may have a potential signal peptide sequence.However, the prediction of Dof gene signal peptide in C.oleifera showed no signal peptide.
According to predicted results of secondary structure, 44 ColDof proteins mainly had random coil, and the contents of α-helix, extended chain structure and β-turn are less, and the order of secondary structure component content of each ColDof protein was random coil > α-helix > extended chain > β-turn.Only ColDof12 was dominated by α-helix, which is manifested as α-helix > random coil > extended chain > β-turn (Supplementary Table 4, Fig. 3).The images of the tertiary structure were consistent with the results of the secondary structure prediction.In addition, a total of 10 independent conserved motifs were identified by analyzing 45 ColDof proteins using the online tool MEME.motif1, characterized by a single zinc finger structure (C2-C2), was a core component of Dof protein in C.oleifera and is present in 43 ColDof proteins.
The cis-acting elements of promoter region regulate accurate initiation and transcription efficiency of gene transcription by binding with transcription factors, and can identify the core region of transcription activation [55,56].ColDof gene family contained a large number of cis-acting elements related to photosensitivity, hormonal response, biological and abiotic stress response, which are speculated to play a role in growth and development, stress tolerance and hormone signaling of C. oleifera.This was consistent with the study results of Merlino et al. [57] on Dof gene family in barley.This research results were also consistent with the study results of Song et al. [58] on Dof gene family in Helianthus annuus.These results were also consistent with study results of Luo et al. [59] on Dof gene family in Camelina sativa.
Codons that code for the same amino acid were called synonymous codons, and they were used at different frequencies during translation.This unbalanced codon use phenomenon was called codon use bias [60].The study of codon preference was conducive to the exploration of genetic evolution, understanding of gene expression characteristics, and providing guidance for molecular breeding.In this study, the codon use preference analysis showed that 45 ColDof genes had a slight preference for codon selection.Only AGA has an RSCU greater than 2, indicating a strong preference for this codon among ColDof genes.The average content of GC3s and GC was less than 50%, and the high-use codon preference ends with A/U(T), indicating that A/U(T) was used more frequently in the coding sequence codon than G/C.These results were consistent with the results of Wang et al. [61] on the codon preference in chloroplast genome of theaceae.
MiRNA regulated a variety of genes and participates in a variety of biological processes, indicating the complexity of miRNA regulation of target genes [62].It was found that ath-miR5021 had the largest number of target genes, and most miRNA maturation sequences (5'-3') were 20 bp in length.As a codominant genetic marker with high polymorphism, good repeatability and strong specificity, SSR played an important role in the analysis of species genetic diversity, the comparison of relatives and the construction of genetic maps [63].In this study, multiple SSR loci of various types were screened, which could provide data reference for further development of specific SSR markers and genetic diversity analysis of C. oleifera [64].The analysis of SSR sites by IPK online tool software showed that the proportion of single nucleotide repeats was the largest, and the frequency of A/T was also the largest.CT/TC is the main motif of dinucleotide.Except for complex nucleotides, the types of SSR motifs increased with the increasing number of motifs, while the number of SSR loci decreased with the increasing number of motifs.
When plants are exposed to hypothermia stress, the hypothermia receptors in cells can rapidly sense the ambient temperature, and then transmit the information to the nucleus through various transduction pathways [65].TFs genes that can respond to hypothermia stress in cells begin to express, thereby regulating the expression of downstream related genes, and ultimately affecting the plant response to low temperature [66].At present, a variety of TFs involved in the regulation of plant hypothermia response have been identified, such as AP2/ERF [67], bHLH [68], MYB [69] and other TFs family members.In this study, 29,912 TFs sequences belonging to 43 TFs families were identified, among which ERF, MYB and Dof were the most abundant TFs families.The AP2/ ERF family is one of the largest transcription factor families in the plant kingdom, and its members can participate in plant response to low temperature and enhance plant cold resistance by regulating the expression of downstream target genes [70].Dof family members were widely involved in plant response to low temperature stress, and overexpression of homologous genes encoding Dof can improve the cold tolerance of transgenic plants.Previous studies have shown that grape VaDof17d gene played a positive role in grape cold tolerance and may be an important candidate gene for molecular breeding of cold resistance [71].Therefore, it was speculated that TFs such as ERF and Dof played an important role in the cold tolerance of C. oleifera.
Through collinearity analysis, it was found that there were multiple homologous Dof gene pairs between each of two genomes of C. oleifera, Yunkang 10, Longjing 43, Shuchazao, Tieguanyin, Biyun, Camellia haploid and Camellia diploid, indicating high collinearity between C. oleifera and these genomes.Many plant gene families evolve and expand due to gene replication events, which may also facilitate the formation of new functional genes and species that are better able to withstand harsh environments as plants evolve [72].Numerous studies have shown that genes generated through fragment replication events may be more likely to be preserved due to subfunctionalization without increasing the likelihood of gene rearrangement [73].Previous collinearity studies of Dof gene families in Tartary buckwheat [74], rose [75], and cotton [76] have shown that segmental repeat events play a dominant role in Dof gene expansion.Similarly, no tandem repeat events were observed in ColDof genes in C.oleifera, and fragment repeats were the primary cause of their amplification, suggesting that some ColDof genes may have originated from genetic repeats in C.oleifera.However, studies have also shown that tandem repeats and segmentary repeats exist in both Dof transcription factors in Brassica napus [38] and poplar [77].Previous studies have shown that phylogenetic tree can provide valuable theoretical basis for function prediction of similar genes in different species, that is, genes clustered in the same group in phylogenetic evolution are relatively conserved in terms of gene structure, protein conserved motif, gene expression pattern, etc.Therefore, genes in the same group may have similar biological functions [78].The Dof gene family of Yunkang No. 10, Longjing No. 43, Shuchazao, Tieguanyin, Biyun, etc. were closely related to C.oleifera, because these are all Camellia species of Camellia family.Among them, ColDof11 was most closely related to CSSDof46.It was most closely related to CSSDof31, a member of Stenophyllum camellia.
Zhang et al. [79] reported that CsYABBY10 and CsY-ABBY5 genes in tea trees have significant drought and salt tolerance functions.CoYABBY gene family in C. oleifera genome has significant salt tolerance functions, and CoYABBY3 gene has the strongest salt tolerance function.So far, no research reports have been found on the salt and drought tolerance of Dof gene family in C. oleifera genome.However, there are currently many research reports on the salt stress tolerance of Dof gene family in plant genomes.Li et al. [80] reported that the silencing of Dof1.7 gene in the cotton genome significantly reduces the mechanism of cotton's salt stress response, indicating that Dof1.7 in cotton genome has a significant salt stress tolerance function.Nan et al. [81] reported that RchDof9, RchDof10, RchDof17 and RchDof20 genes in Rosa chinensis genome exhibit significant molecular mechanisms underlying salt stress tolerance responses.Zhou et al. [82] reported that the ClDof29 gene in watermelon has significant salt tolerance.The above indicates that Dof gene family in the plant genome has a certain molecular mechanism of salt stress tolerance, which is similar to the results found in this study that ColDof1, ColDof2, ColDof14 and ColDof36 have significant salt tolerance.
Yu et al. [83] reported that most members of Dof gene family in the tea plant genome have a molecular mechanism for drought resistance.Sun et al. [84] reported that BpDof4, BpDof11 and BpDof17 in the Betula platyphylla genome exhibit significant molecular mechanisms of drought stress tolerance.Chen et al. [85] reported that the MdDof54 gene in the apple genome exhibits significant drought resistance.The above research results are similar to the findings of this study, which suggest that ColDof1, ColDof2, ColDof5, ColDof14, ColDof27 and ColDof36 may be involved in their response to drought stress.
The results of this study provide a reference for further research on the biological functions of Dof gene family in C.oleifera during its growth and development.

Conclusion
In this study, we have identified 45 ColDof proteins in C.oleifera genome.All the 45 ColDof members are non-transmembrane and non-secretory proteins.The biological function of ColDof proteins was mainly realized by phosphorylation at serine (Ser) site.ColDof genes' promoter contains a variety of cis-acting element elements, including light response, gibberellin response, abscisic acid response, auxin response and drought induction elements.ColDof gene family was most closely related to that of diploid tea tree and Camellia lanceoleosa.There were 40 colinear locis between ColDof with Dof protein of Arabidopsis thaliana.ColDof34, ColDof20, ColDof28, ColDof35, ColDof42 and ColDof26 have the most protein interactions.Moreover, ColDof1, ColDof2, ColDof14 and ColDof36 not only have significant molecular mechanisms for salt stress tolerance, but also significant molecular functions for drought stress tolerance.This study systematically identified the genetic characteristics, protein characteristics, and molecular evolutionary relationships of Dof gene family in C. oleifera genome, and elucidated the involvement of most ColDof genes in the growth and development process of C. oleifera, especially in the response to salt stress and drought stress of C. oleifera.The results of this study provide a reference for further understanding of the function of ColDof genes in C.oleifera.

Fig. 1
Fig. 1 Chromosome mapping of Dof gene family in C.oleifera

Fig. 2
Fig. 2 Gene structure of Dof family in C.oleifera

Fig. 3
Fig. 3 Secondary structure of Dof family in C.oleifera

Fig. 5
Fig. 5 The ERF (A), Dof (B), MYB (C) and BBR-BRC (D) TF binding sites in the promoter region of the ColDof genes

Fig. 13 Fig. 12
Fig. 13 Gene expression analysis of ColDof genes under different concentration PEG6000 by qRT-PCR experiment

Table 1
Phosphorylation sites analysis of ColDof proteins online tool MEME was used to analyze 45 ColDof proteins' sequences, 10 independent conserved motifs were identified.The 10 conserved motifs ranged in length from 15 to 50 amino acids.According to different Dof proteins have different number and types of motif, it is speculated that the reason may be because different Dof proteins play different functions in vivo.In 45 ColDof proteins, each ColDof protein contained at least 1 con- served motifs, 3 ColDof proteins contained 1 conserved motifs, and 4 ColDof proteins contained 10 conserved motifs.There are 32 ColDof proteins with 2 conserved motifs, 2 ColDof proteins with 5 conserved motifs, 1 ColDof protein with 6 conserved motifs, and 3 ColDof proteins with 7 conserved motifs.(Supplementary Figs. 2

ColDof genes in C. Oleifera
the mature sequence of miRNA with length of 21 bp accounted for 8.81% of all sequences (Supplementary Table7).
(2)get ColDof genes.And there was only one miRNA in 90 (ath-miR156c-3p, ath-miR156d-3p, ath-miR156f-3p, ath-miR161.2,etc.).The length of miRNA maturation sequence (5'-3') was mainly 20 bp, accounting for 67.54% of all sequences.The mature sequence of miRNA with length of 19 bp accounted for 19.90% of all sequences, and Anaerobic induction elements were found in 39 ColDof genes, among which ColDof19 promoter region had the most anaerobic induction elements(5).Gibberellin response elements were found in 24 ColDof genes, among which the promoter region of ColDof29 had the most gibberellin response elements (3).Salicylic acid response elements were found in 25 family members, among which ColDof8, ColDof11, ColDof21, ColDof22 and ColDof38 had the most salicylic acid response elements(2).Auxin response elements were found in 17 family members, and 5 auxin response elements in ColDof45 promoter region were the highest.The metabolic elements of corn protein were found in 23 family members, among which ColDof4 and ColDof8 promoter regions were the most photoresponsive elements(3).Circadian control elements were found in 5 ColDof genes, among which ColDof17 promoter region had the most circadian control elements(2).Seed-specific regulatory elements were found in four family members, and only one was found in the promoter region of ColDof25, ColDof30, ColDof31 and ColDof32.Cold-responsive elements were found in 9 ColDof genes, among which ColDof12 and ColDof20 had the most cold-responsive elements(2).andstressresponse elements were found in 18 ColDof genes, among which ColDof2, ColDof23, ColDof24 and ColDof28 had the most defense and stress response elements(2).Endosperm expression elements were found in 17 ColDof genes, among which ColDof4 and ColDof14 promoter regions had the most expression elements (2).
are both less than 50%, indicating that AU is used more frequently than GC in the codon of the coding sequence of members of this family.Codon adaptation index (CAI) varied from 0.25 to 0.34, with an average value of 0.18.

Table 2
Gene duplication types and Ka/Ks analysis for duplicated gene pairs of ColDof genes